Aminet 5

Aminet 5 - March 1995.iso / Aminet / dev / m2 / Turbo_1.lha / modula / docs / OPTOZS.DOC

Text File | 1995-01-24 | 4KB | 111 lines

Compiler optimizations
======================

The compiler applies a few easy-to-implement optimizations to generate
small & fast code. The first three (as described below) give the Dhrystone
benchmark a considerable speed boost.

These optimizations increase compilation time, but because the compiler
compiles itself, the slowdown is more than made up for by the higher
quality code. The compiler consists of 14000 lines of (dense) code and
compiles itself in around 60 seconds on a 14MHz MC68020.

These optimizations are also performed by the DICE compiler and assembler.

The commercial M2Amiga compiler, used to bootstrap M2C, generates
relatively slower (and larger) code because it does not attempt to do any
of the following:

Register Allocation
-------------------

Simple local variables and parameters are allocated to registers based on
weighted usage. Parameters are copied off the stack and into registers at
procedure entry. Locals used inside nested loops are given highest
priority.

Typically there are 6 data registers (d2-d7) and 4 address registers
(a2,a3,a5,a6) available. Procedures that contain complex expressions may
have fewer registers available. Low level procedures (that call no others)
can also use d0,d1,a0,a1 to hold locals.

Locals that are passed as VAR parameters to other procedures, that are
arguments of SYSTEM.ADR, or that are referenced from within nested
procedures (so-called non-local, non-global variables) cannot be allocated
to registers.

LINK/UNLINK removal
-------------------

If the compiler manages to allocate all parameters and variables for a
particular procedure to registers, then it can remove the LINK & UNLINK
instruction pair which normally constructs/destructs a stack frame for
variables in that procedure. If this can be done, the procedure call
overhead is greatly reduced. This is especially important in small
procedures, e.g. when using opaque types we are sometimes forced to use a
function just to read the contents of a field.

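The two sections above can be illustrated with a rough sketch. It is
written in Python for brevity and is not taken from the compiler's source.
The register pools come from the text; the weighting factor of 10 per loop
level, the Local record and the allocate function are all assumptions made
up for this example.

```python
from dataclasses import dataclass

# Register pools as listed in the text.
DATA_REGS = ["d2", "d3", "d4", "d5", "d6", "d7"]
ADDR_REGS = ["a2", "a3", "a5", "a6"]
SCRATCH_DATA = ["d0", "d1"]   # only for procedures that call no others
SCRATCH_ADDR = ["a0", "a1"]

LOOP_WEIGHT = 10  # assumption: each loop nesting level multiplies a use by 10


@dataclass
class Local:
    """Hypothetical record describing one local variable or parameter."""
    name: str
    is_pointer: bool             # pointers compete for address registers
    uses: list                   # loop depth of every reference, e.g. [0, 2, 2]
    address_taken: bool = False  # VAR argument, SYSTEM.ADR, or nested-procedure use

    def weight(self):
        return sum(LOOP_WEIGHT ** depth for depth in self.uses)


def allocate(locals_, is_leaf=False):
    """Return (register assignment, whether a LINK/UNLINK frame is still needed)."""
    data = DATA_REGS + (SCRATCH_DATA if is_leaf else [])
    addr = ADDR_REGS + (SCRATCH_ADDR if is_leaf else [])
    assignment = {}
    # Heaviest variables choose first; ties keep declaration order.
    for var in sorted(locals_, key=lambda v: v.weight(), reverse=True):
        if var.address_taken:
            continue  # must live in memory
        pool = addr if var.is_pointer else data
        if pool:
            assignment[var.name] = pool.pop(0)
    # The frame can only be dropped if nothing was left on the stack.
    needs_frame = len(assignment) < len(locals_)
    return assignment, needs_frame
```

With this weighting, a loop counter referenced three times at nesting
depth 2 outranks a variable referenced twice outside any loop, and a single
local whose address is taken is enough to keep the stack frame.
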
CASE statement
--------------

Generating code for the case statement is quite tricky. The compiler must
adequately cope with a variety of styles:

    CASE exp OF
    |-42  : ..
    |10000: ..    (* This is bad style but is still legal. *)
    END           (* Some (inadequate) compilers will complain if you try this *)

or --

    CASE exp OF
    |0: ..
    |1: ..
    |2: ..
    |3: ..        (* Case labels are contiguous *)
    ...           (* This is really what the case statement is for *)
    |100: ..
    END ;

or a combination of the 2 styles.

The algorithm for the case statement is:

  If there are only a few case labels (fewer than 8, say) then perform a
  linear search.

  If the number of case labels is a high proportion (at least 50 percent,
  say) of MAX(label)-MIN(label) then perform a table lookup to see where
  to branch to.

  Otherwise subdivide the labels (into 2 halves) and recursively apply the
  algorithm.

The compiler generates code to implement this algorithm.

Branch Optimization
-------------------

The M68000 provides a 16-bit as well as a 32-bit branch instruction. When
the compiler generates a forward branch it's impossible for it to know
which to use. To be safe it generates space for a 32-bit instruction; if it
subsequently turns out that only a 16-bit instruction was needed, the code
will contain a 2-byte hole. The compiler keeps track of where these holes
occur and after code generation compacts all the code.

Branch optimization typically reduces code size by about 5 percent. If you
spend a few minutes thinking about it, you'll realize that performing
branch optimization is more complex than it appears.

Addressing Models
-----------------

The compiler supports (and defaults to) small data (A4 relative) & small
code (PC relative) memory models. These models reduce the size of the
relocation information in the final executable, i.e. it's smaller and loads
faster.

Future optimizations
--------------------

The compiler does not implement registerized parameter passing; this may
be added.
It doesn't do any 'global' optimizations either, because I don't know
how :) This is only my second compiler; the first was for a toy
programming language (at college).
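
As an illustration of the strategy described in the CASE statement section,
here is a small Python sketch that only decides how to dispatch and emits
no code. The thresholds are the "say" values from the text; the function
name plan_case and the tuple layout of its result are assumptions for this
example, and the labels are assumed to be distinct integers.

```python
LINEAR_LIMIT = 8    # "fewer than 8, say"
MIN_DENSITY = 0.5   # "at least 50 percent, say"


def plan_case(labels):
    """Return a nested tuple describing how to dispatch on the given labels."""
    labels = sorted(labels)
    if len(labels) < LINEAR_LIMIT:
        # Few labels: compare against each one in turn.
        return ("linear", labels)
    span = labels[-1] - labels[0]          # MAX(label) - MIN(label)
    if len(labels) >= MIN_DENSITY * span:
        # Dense labels: index a jump table covering MIN..MAX.
        return ("table", labels[0], labels[-1])
    # Sparse labels: compare against the first label of the upper half,
    # then apply the same rules to each half.
    mid = len(labels) // 2
    return ("split", labels[mid],
            plan_case(labels[:mid]), plan_case(labels[mid:]))
```

For the two styles shown in that section, the labels -42 and 10000 end up
as a linear search and the labels 0..100 as a single table. A mixture such
as 0..9 together with 1000..1009 is split once and then handled as two
small tables.
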
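
The reason branch optimization (see the Branch Optimization section) is
harder than it looks is that closing one hole moves the code after it,
which can bring another branch's target into short range. The Python
sketch below shows one way to handle that, by repeating the pass until
nothing changes. It is a simplified model and not the compiler's own
method: every branch starts long (the text only needs this for forward
branches), code is a list of items with positive sizes, and fits_short
encodes an assumed rule of an 8-bit signed displacement measured from the
branch address plus 2, with 0 reserved.

```python
SHORT, LONG = 2, 4   # branch instruction sizes in bytes, as in the text


def fits_short(disp):
    # Assumed rule: signed 8-bit displacement, 0 marks the long form.
    return -128 <= disp <= 127 and disp != 0


def short_disp(i, target, addr):
    """Displacement branch i would have if it were shrunk to the short form."""
    disp = addr[target] - (addr[i] + 2)
    if target > i:
        disp -= LONG - SHORT   # closing the hole pulls a forward target closer
    return disp


def compact(items):
    """items: list of ("code", nbytes) or ("branch", target_index).

    Returns the size of every item once no more holes can be closed.
    """
    sizes = [n if kind == "code" else LONG for kind, n in items]
    changed = True
    while changed:
        changed = False
        # Recompute the address of every item for the current sizes.
        addr = [0]
        for s in sizes:
            addr.append(addr[-1] + s)
        for i, (kind, target) in enumerate(items):
            if kind == "branch" and sizes[i] == LONG:
                # Shrinking only ever moves a branch and its target closer,
                # so a branch that fits now still fits after later shrinks.
                if fits_short(short_disp(i, target, addr)):
                    sizes[i] = SHORT
                    changed = True
    return sizes
```

In this model a branch to the very next instruction stays long, because
its short displacement would be 0, and a branch that is just out of range
on the first pass can become short on a later pass once a branch between
it and its target has been shrunk.
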